The FMA instruction set is an extension to the 128- and 256-bit Streaming SIMD Extensions instructions in the x86 microprocessor instruction set that performs fused multiply–add (FMA) operations.〔"FMA3 and FMA4 are not instruction sets, they are individual instructions -- fused multiply add. They could be quite useful depending on how Intel and AMD implement them"〕 There are two variants:
* FMA4 is supported in AMD processors starting with the Bulldozer architecture. FMA4 was realized in hardware before FMA3.
* FMA3 is supported in AMD processors starting with the Piledriver architecture and in Intel processors starting with Haswell, followed by Broadwell in 2014.

==New instructions==
FMA3 and FMA4 instructions have almost identical functionality but are not compatible. Both contain fused multiply–add (FMA) instructions for floating-point scalar and SIMD operations, but FMA3 instructions have three operands while FMA4 instructions have four.

The FMA operation has the form

''d'' = round(''a'' × ''b'' + ''c'')

where the round function rounds the result so that it fits in the destination register when the exact value of ''a'' × ''b'' + ''c'' has too many significant bits.

The 4-operand form (FMA4) allows ''a'', ''b'', ''c'' and ''d'' to be four different registers, while the 3-operand form (FMA3) requires that ''d'' be the same register as ''a'', ''b'' or ''c''. The 3-operand form makes the code shorter and the hardware implementation slightly simpler, while the 4-operand form provides more programming flexibility.

See XOP instruction set for more discussion of compatibility issues between Intel and AMD.

Excerpt source: Wikipedia, the free encyclopedia.
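The form ''d'' = round(''a'' × ''b'' + ''c'') implies a single rounding step, whereas a separate multiply followed by an add rounds twice. The following Python sketch illustrates the difference for double-precision values. It does not use the FMA3 or FMA4 instructions themselves: the helper name fused_multiply_add is hypothetical, and it emulates the single rounding with exact rational arithmetic, assuming finite inputs.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate d = round(a * b + c) with one rounding step.

    Hypothetical helper for illustration only; assumes finite inputs.
    """
    # Fraction converts each float exactly, so the product and the sum
    # below are computed without any intermediate rounding.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    # Converting back to float rounds once, to the nearest double.
    return float(exact)


# Inputs chosen so that the exact product a * b equals 1 - 2**-54,
# which is not representable as a double and rounds to 1.0.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = a * b + c                  # two roundings: product, then sum
fused = fused_multiply_add(a, b, c)  # one rounding at the end

print(unfused)
print(fused)
```

With these inputs the separate multiply and add lose the low-order term of the product and yield zero, while the single-rounding result keeps it as −2<sup>−54</sup>.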